{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "7ec41d9f-61ce-4ad8-a166-13ed751733fe",
   "metadata": {},
   "source": [
    "# Query Examples\n",
    "This notebook contains a few examples of search performed on the `books` index previously created from the files [local_embedding_example](./local_embedding_example.ipynb) or [pipeline-inference-example](./pipeline-inference-example.ipynb). "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "36dbf3da-607e-42ff-80a4-f7a7e7601473",
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install -r requirements.txt"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "d7348f74-846e-4f28-b262-b2f452001e11",
   "metadata": {},
   "outputs": [],
   "source": [
    "from elasticsearch import Elasticsearch\n",
    "from getpass import getpass\n",
    "import pprint\n",
    "\n",
    "pp = pprint.PrettyPrinter(indent=2)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "77d58474-fe09-4d9a-8b51-858ce86e3584",
   "metadata": {},
   "source": [
    "Connect to Elasticsearch with an Elastic Deployemnt Endpoint and Elastic Api Key\n",
    "\n",
    "This notebook connects to the `books-pipeline` index, as the ingestion-based scripts index all of the books by default. If you would like to use the `books-local` index instead, replace the value of `INDEX_NAME` to `books-local` below."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4729052a-d72f-468a-9d17-3520ed18cc38",
   "metadata": {},
   "outputs": [],
   "source": [
    "cloud_endpoint = getpass(\"Elastic deployment Cloud Endpoint: \")\n",
    "cloud_api_key = getpass(\"Elastic deployment API Key: \")\n",
    "\n",
    "INDEX_NAME = \"books-pipeline\"\n",
    "MODEL_ID = \"sentence-transformers__msmarco-minilm-l-12-v3\"\n",
    "\n",
    "es = Elasticsearch(\n",
    "    hosts=cloud_endpoint,\n",
    "    api_key=cloud_api_key,\n",
    ")\n",
    "\n",
    "query_string = \"Dinosaurs are still alive\""
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7b97c764-6da3-40ff-8ce9-051c1f02cb1d",
   "metadata": {},
   "source": [
    "#### The queries will return and display using the `print_results` function defined below which outlines the book title, description, and search return score."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "fd407bef-9c72-4971-83f1-6e2cf818ab4c",
   "metadata": {},
   "outputs": [],
   "source": [
    "def print_results(search_result):\n",
    "    print(f\"Total hits: {search_result['hits']['total']['value']}\")\n",
    "    for hit in search_result[\"hits\"][\"hits\"]:\n",
    "        print(f\"Book: {hit['_source']['book_title']}\")\n",
    "        print(f\"Author: {hit['_source']['author_name']}\")\n",
    "        print(f\"Description: {hit['_source']['book_description']}\")\n",
    "        print(f\"Score: {hit['_score']}\\n\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0b76adf0-1341-423e-98dc-914c1ed34c49",
   "metadata": {},
   "source": [
    "### 1. Vector Search Example"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b1e40135-8321-4713-abb0-0bdb82894bc7",
   "metadata": {},
   "outputs": [],
   "source": [
    "vector_search_result = es.search(\n",
    "    index=INDEX_NAME,\n",
    "    knn={\n",
    "        \"field\": \"description_embedding\",\n",
    "        \"k\": 10,\n",
    "        \"num_candidates\": 50,\n",
    "        \"query_vector_builder\": {\n",
    "            \"text_embedding\": {\"model_id\": MODEL_ID, \"model_text\": query_string}\n",
    "        },\n",
    "    },\n",
    ")\n",
    "\n",
    "print_results(vector_search_result)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c5718325-b845-45c4-94ea-6d6041863b3b",
   "metadata": {},
   "source": [
    "### 2. Full-Text Search Example"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "959be210-ee7b-44ed-9c75-2dab3d943dc8",
   "metadata": {},
   "outputs": [],
   "source": [
    "bm25_result = es.search(\n",
    "    index=INDEX_NAME,\n",
    "    body={\"query\": {\"match\": {\"book_description\": query_string}}, \"size\": 10},\n",
    ")\n",
    "\n",
    "print_results(bm25_result)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "bc45aca6-1243-4602-90f4-8f8cb879c087",
   "metadata": {},
   "source": [
    "### 3. Hybrid Search Example"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e2595c6b-e997-41bf-8897-c46334de8907",
   "metadata": {},
   "outputs": [],
   "source": [
    "hybrid_search_result = es.search(\n",
    "    index=INDEX_NAME,\n",
    "    body={\n",
    "        \"query\": {\n",
    "            \"bool\": {\n",
    "                \"must\": [\n",
    "                    {\n",
    "                        \"match\": {\n",
    "                            \"book_description\": query_string,\n",
    "                        }\n",
    "                    },\n",
    "                ]\n",
    "            }\n",
    "        },\n",
    "        \"knn\": {\n",
    "            \"field\": \"description_embedding\",\n",
    "            \"k\": 10,\n",
    "            \"query_vector_builder\": {\n",
    "                \"text_embedding\": {\"model_id\": MODEL_ID, \"model_text\": query_string}\n",
    "            },\n",
    "        },\n",
    "        \"rank\": {\"rrf\": {}},\n",
    "        \"size\": 10,\n",
    "    },\n",
    ")\n",
    "\n",
    "print_results(hybrid_search_result)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
